Character Recognition of Arabic and Latin Scripts

Authors

  • Fiaz Hussain
  • John Cowell
Abstract

The goal of producing effective Optical Character Recognition (OCR) methods has led to the development of a number of algorithms. Their purpose is to take hand-written or printed text and translate it into a corresponding digital form. The many requirements and developments are well represented in the literature (see, for example, Abuhaiba [1] and Suen [2]). The primary objective of this paper is to provide an insight into a robust system that has been successfully developed and employed to recognise Latin and Arabic characters, and whose workings have been described by the authors in a sister publication [3]. The focus here is on the main components of the multi-stage system, with particular attention to the normalisation process used to correct the orientation and size of a given bitmapped character. The effectiveness of the approach is demonstrated through its application to the Arabic and Latin cases, for both characters and numbers.
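
The abstract does not say how the normalisation of orientation and size is carried out, so the following is only an illustrative sketch of one common approach, not the authors' method. It assumes the character is a binary bitmap stored as a list of rows of 0/1 values. Size is normalised by cropping to the bounding box and resampling to a fixed grid with nearest-neighbour sampling, and orientation is estimated from second-order central moments. The function names and the default grid size are hypothetical.

```python
import math


def bounding_box(bitmap):
    """Return (top, bottom, left, right) of the set pixels, inclusive."""
    rows = [r for r, row in enumerate(bitmap) if any(row)]
    if not rows:
        raise ValueError("bitmap has no set pixels")
    cols = [c for c in range(len(bitmap[0])) if any(row[c] for row in bitmap)]
    return rows[0], rows[-1], cols[0], cols[-1]


def normalise_size(bitmap, size=8):
    """Crop to the bounding box, then rescale to size x size.

    Nearest-neighbour sampling is an assumption made for brevity.
    """
    top, bottom, left, right = bounding_box(bitmap)
    height = bottom - top + 1
    width = right - left + 1
    return [
        [
            bitmap[top + (r * height) // size][left + (c * width) // size]
            for c in range(size)
        ]
        for r in range(size)
    ]


def orientation_angle(bitmap):
    """Estimate the principal-axis angle in radians.

    The angle is measured from the column (x) axis towards the row (y)
    axis, using second-order central moments of the set pixels.
    """
    points = [
        (r, c)
        for r, row in enumerate(bitmap)
        for c, value in enumerate(row)
        if value
    ]
    if not points:
        raise ValueError("bitmap has no set pixels")
    n = len(points)
    mean_r = sum(r for r, _ in points) / n
    mean_c = sum(c for _, c in points) / n
    mu_rr = sum((r - mean_r) ** 2 for r, _ in points)
    mu_cc = sum((c - mean_c) ** 2 for _, c in points)
    mu_rc = sum((r - mean_r) * (c - mean_c) for r, c in points)
    return 0.5 * math.atan2(2 * mu_rc, mu_cc - mu_rr)
```

A recogniser built this way could rotate the bitmap by the negative of the estimated angle before calling `normalise_size`, so that later stages see characters in a consistent pose and scale. The rotation step is omitted here.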

Similar resources

Off-Line Arabic Handwriting Character Recognition Using Word Segmentation

The ultimate aim of handwriting recognition is to make computers able to read and/or authenticate human-written texts, with a performance comparable to or even better than that of humans. Reading means that the computer is given a piece of handwriting and provides an electronic transcription of it (e.g. in ASCII format). There are two types of handwriting: on-line and off-line. The most important pu...

A Finite State Model for Urdu Nastalique Optical Character Recognition

Finite state technology has long been used to model NLP (Natural Language Processing) applications; in particular, it has been applied very successfully to machine translation and speech recognition systems. Character recognition in cursive scripts or handwritten Latin script has also attracted researchers' attention, and some research has been done in this area. Optical character recognition is the ...

Acquisition Segmentation Feature Extraction Classification Post Processing Pre - Processing

Arabic script is the third most widely used writing system after Latin and Chinese, but research in Arabic Optical Character Recognition (OCR) is still nascent in comparison to that for Latin script. Arabic script is inherently cursive, so techniques developed for other scripts are generally inappropriate for Arabic. In this paper we present recent progress in the field of Handwritten A...

Generalization of Hindi OCR Using Adaptive Segmentation and Font Files

In this chapter, we describe an adaptive Indic OCR system implemented as part of a rapidly retargetable language tool effort and extend work found in [20, 2]. The system includes script identification, character segmentation, training sample creation, and character recognition. For script identification, Hindi words are identified in bilingual or multilingual document images using features of t...

A Brief Study of Feature Extraction and Classification Methods Used for Character Recognition of Brahmi Northern Indian Scripts

According to the 8th Schedule of the Indian constitution, there are 22 official languages and 122 regional languages prevalent in India. In the last few decades, the recognition of these scripts has been a prominent area of research. Among these scripts, most of the recognition research has been done for the Bangla, Devanagari, Gujarati, Gurumukhi and Telugu scripts. Commercial OCRs were available...

Publication date: 2000